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ABSTRACT 

Social coordination allows users to move beyond awareness 
of their friends to efficiently coordinating physical activities 
with others. While specific forms of social coordination can 
be seen in tools such as Evite, Meetup and Groupon, we 
introduce a more general model using what we call enmeshed 
queries. An enmeshed query allows users to declaratively 
specify an intent to coordinate by specifying social attributes 
such as the desired group size and who/what/when, and 
the database returns matching queries. Enmeshed queries 
are continuous, but new queries (and not data) answer older 
queries; the variable group size also makes enmeshed queries 
different from entangled queries, publish-subscribe systems, 
and dating services. 

We show that even offline group coordination using en- 
meshed queries is NP-hard. We then introduce efficient 
heuristics that use selective indices such as location and time 
to reduce the space of possible matches; we also add re- 
finements such as delayed evaluation and using the relative 
matchability of users to determine search order. We describe 
a centralized implementation and evaluate its performance 
against an optimal algorithm. We show that the combina- 
tion of not stopping prematurely (after finding a match) and 
delayed evaluation results in an algorithm that finds 86% 
of the matches found by an optimal algorithm, and takes 
an average of 40 fisec per query using 1 core of a 2.5 Ghz 
server machine. Further, the algorithm has good latency, 
is reasonably fair to large group size requests, and can be 
scaled to global workloads using multiple cores and multiple 
servers. We conclude by describing potential generalizations 
that add prices, recommendations, and data mining to basic 
enmeshed queries. 

1. INTRODUCTION 

While social networks like Facebook are widely used for 
interacting with friends, they seem less useful for planning 
and coordinating with other people. But, this is a significant 



part of what being social entails — we go to birthday par- 
ties, we play sports, we fraternize with like-minded people 
on even obscure topics such as fly-fishing. Further, the de- 
fault method of coordinating with people by phone/email is 
tedious, and sometimes hard to drive to closure. Thus social 
coordination seems the next step beyond social networking. 

There are a number of ad hoc social coordination tools 
already in the market. These include evite.com (coordina- 
tor specifies list of invitees and an event), doodle . com (users 
specify a range of times for a meeting and the coordinator 
can look for common times), meetup.com (users specify a 
zipcode and a topic, and then browse through the groups 
returned by the system), foursquare.com (users check into 
a database of locations, and can query the system for the 
location of their friends), and groupon . com| (users specify 
a location and interest, and then browse through a set of 
deals; however, deals only take effect when a critical mass 
of users sign up for the deal). 

Existing social coordination tools have four disadvantages: 
1. Ad Hoc: Each system is tailor-made to a specific appli- 
cation although they have many abstract features in com- 
mon. 2. Limited Query Capability: While most provide 
a rudimentary query facility, they also involve considerable 
browsing and selection by the user. 3. Not Continuous: If 
there is no current choice that satisfies the user, the user 
must retry later: synchronous polling is needed as opposed 
to asynchronous notification by the system. 4- No group 
size control: Users cannot specify limits on the group size. 

In contrast, our goal is to develop tools for fluid social 
coordination which have the following properties. 

• Generic Platform: We abstract coordination as finding 
matches on key attributes such as people, activity, loca- 
tion, and time in a common platform that many social 
applications can use. 

• Declarative Queries: Users specify predicates that pre- 
scribe the kind of groups they wish: the system matches 
users to groups without need for browsing. 

• Continuous: If there is no current choice, the system will 
retain the user query and subsequently attempt to match 
this query when future queries enter. 

• Group Constraints: Users can specify group constraints 
(such as bounds on group size). 

By fluid, we mean that coordination is not limited to static 
groups such as friends; instead, we allow coordinating even 
with strangers, and with different sets of people for different 
activities. Granovetter |3| has argued that novel information 
flow (such as job opportunities) typically occurs through 



"weak" ties (e.g., casual acquaintances) than through "strong" 
ties (e.g., friends and family). We suggest that allowing peo- 
ple to perform fluid social coordination may encourage the 
formation of weak ties. 

Fluid coordination appears in a number of real world ap- 
plications. Consider multiplayer online gaming where users 
wish to form groups whose sizes can depend on the game. 
The user is indifferent to player identities except that they 
be close by (to reduce latency), and have similar Internet 
access speeds and game ratings. By allowing a user, of say 
Xbox LIVE, to specify parameters in these three key at- 
tributes and a group size (say 4 to play Halo), our system 
can match users up. As a second example, consider find- 
ing 3 partners among all Microsoft employees for playing 
doubles tennis in Sunnyvale at 4 pm on the weekend. Pri- 
vate data collection is a third example. Users may allow 
a hospital to publish their information only if they share 
the same "quasi-identifying" attribute with a group of ft — 1 
other users (fc-anonymity 11 ), or if additionally i distinct 
diseases appear in their group (^-diversity [8]). 

Complete fluidity may alarm some users, especially when 
considering physical activities such as tennis, where a user 
may not wish to play with another player, say, with a crimi- 
nal record. We allow users effectively to "scope" their coor- 
dination intents using constraints. For instance, a user can 
request to only play with other users affiliated with Duke 
University (i.e., with a duke.edu email suffix). Users can 
set the scope to be wide enough to match their query and 
yet narrow enough to stay in their comfort zone. 

In this paper, we present a system that can support fluid 
coordination as described in the above examples. In this sys- 
tem, users describe their coordination intent using a novel 
concept called enmeshed queries, which allow a user to spec- 
ify coordination constraints, like who/ what /when, and con- 
straints on coordinating group, like on the number of people 
forming a group. The formalization and scalable implemen- 
tation of these queries is the main contribution of this paper. 

From an algorithmic viewpoint, think of the set of en- 
meshed queries as forming the nodes of a graph. We place 
an edge from node R to node S if R is compatible with S. For 
example, the queries of two tennis players who have compat- 
ible ratings who wish to play at the same time and the same 
place will be linked by an edge. The query system attempts 
to carve this graph into cliques such that each clique satis- 
fies its group size constraints; for example, one clique could 
be a set of 4 compatible tennis players to play doubles. 
Contributions and Paper Outline: First, we formal- 
ize the problem of fluid social coordination by defining the 
concept of enmeshed queries. Second, we show that opti- 
mal algorithms for matching enmeshed queries are NP-hard; 
however, we introduce heuristics that are close to optimal 
on typical workloads. Third, we describe an implementation 
that can scale to millions of concurrent enmeshed queries. 

The rest of the paper is organized as follows. Section[2]de- 
scribes related work. Section [3] describes enmeshed queries, 
and theoretical hardness results are described in Section [4] 
Algorithms are described in Section [5] Section [5] presents 
an evaluation that suggests that our algorithms can scale to 
millions of queries per second. We conclude (Section]?]) with 
a list of future research directions. 



2. RELATED WORK 



Enmeshed queries are most closely related to entangled 
queries [4j |5 , 9 . The seminal paper on entangled queries [4] 
required users to explicitly specify partners they wish to co- 
ordinate with (e.g., tennis with John and Sarah). More re- 
cent work [9] allows some flexibility in choosing partners by 
allowing an entangled query to specify one partner to be any 
friend (or someone related to the user through a prespeci- 
fied binary relation). While enmeshed and entangled queries 
share the goal of declarative social coordination, there are 
significant differences. First, enmeshed queries allow group 
constraints such as bounds on group sizes; entangled queries 
do not. Second, enmeshed queries are designed for fluid co- 
ordination - partners need not be explicitly specified, and 
indeed may not even be friends. Third, enmeshed queries 
adopt online algorithms that match a newly arrived query 
with a subset of existing queries. Our online algorithms 
are designed to scale to millions of concurrent queries. In 
contrast, entangled queries consider (offline) coordination 
among a given small set of queries (1000s of queries in [4] 
[9]). Entangled queries, on the other hand, have consider- 
able power in specifying constraints on external relations 
that may be the final object of coordination (e.g., a ten- 
nis reservation at a specified court). An interesting future 
direction is to combine the power of these two formalisms. 

Pub-sub systems 10 and continuous query systems [12] 



also provide declarative continuous query evaluation, but 
each query is logically independent; further there are no con- 
straints on groups. On the other hand, enmeshed queries 
find matches among queries. Our approach to find candi- 
date query matches is related to work in efficiently evaluat- 
ing boolean expressions U]. Finally, enmeshed queries differ 
from nested transactions IV] , since at query time the system 
does not know which other enmeshed queries it is waiting 
on. Table [l] summarizes these comparisons. 

There is also related work on team formation in social 
networks 16] that studies the following problem: given a set 
of people (i.e. nodes in a graph) with certain skills and 
communication costs across people (i.e. edges in the graph), 
and a task T that requires some set of skills, find a subset 
of people to perform T with minimal communication costs. 
They prove that such problems are NP hard, and describe 
heuristics to reduce computation. Our problem is not the 
same as task formation because tasks are not explicit first- 
class entities in enmeshed queries but are, instead, implicit 
in the desires of users. Further, our metric is maximizing 
matches and not minimizing communication. 



3. MODEL AND PRELIMINARIES 

We formally define enmeshed queries and the problem of 
social coordination. 

Users: Consider a set of users U who wish to participate 
in coordination tasks. Each user is associated with a set 
of attributes Au- Examples of user attributes include age, 
home location, tennis rating, etc. Denote by dom(A) the 
domain of an attribute A, and by dom(A) = x AeAdom(A) 
the cross product of the domains of a set of attribtues A. 
A point x — [x 1 , . . . , Xk] is a multidimensional value from 
dom({A!, . . .,A k }). 

Enmeshed Queries: Users specify their intent to coor- 
dinate using one or more enmeshed queries. An enmeshed 
query q is associated with (i) a unique user q.user, and (ii) a 
set of free coordination variables q.x — [q.xi, q.X2, ■ ■ ■ , q.x\ C \]- 





Group Constraints 


Independent Queries 


Query Life 


# of queries 


Enmeshed 


Yes 


No 


Short 


Millions 


Entangled 


No 


No 


Short 


Thousands 


Continuous Queries 


No 


Yes 


Long 


Varying 


Pub-Sub 


No 


Yes 


Long 


Millions 



Table 1: Comparison among continuous queries, pub/sub systems, entangled queries and enmeshed queries. 



Each free variable q.xt takes values from a distinct coor- 
dination attribute Ai £ Ac- Examples of coordination 
attributes are time(when), location (where), activity type 
(what), etc. Enmeshed queries define coordination intent 
by specifying constraints. Formally, an enmeshed query is a 
triple (S, J , Q), where 5 is a set of selection constraints, J is 
a set of join constraints, and Q is a set of group constraints. 
We define S, J ', and Q in turn. 

First, 5 specifies the query's coordination intent S by 
specifying sets of possible values for each coordination vari- 
able. For instance, a user may want to play tennis or squash, 
between 4 and 6 PM, in either Cupertino or Mountain View. 
More formally, coordination intent 5 is specified as a con- 
junction of selection constraints, where each selection con- 
straint is of the form Xi £ S, where S C dom(Ai). Note that 
queries may not pose any constraint on some coordination 
attributes; this is modeled using S — dom(Ai). 

Second, J specifies additional join constraints over the 
user attribute values on pairs of enmeshed queries. Ex- 
amples include: Alice would like to coordinate only with 
her friends, and Alice would like to play tennis with users 
who have a higher rating. If Alice knew a priori that she 
was coordinating with Bob, then Alice could easily express 
the above rating constraint as Bob £ Alice. Friends and 
(Bob.rating > Alice. rating). However, when declaring co- 
ordination intent, Alice does not know who she is coordi- 
nating with. To describe join constraints, we introduce the 
notation that to refer to attributes of other users who may 
potentially coordinate on a task. Thus the above constraints 
in Alice's query q can be written as that.zd £ user. Friends 
and that, rating > user .rating. J is a conjunction of such 
individual join constraints. 

Finally, Q specifies predicates on allowable groups. For 
example, one could specify the average rating of the group 
of individuals for a doubles tennis game. For this paper, we 
focus on the simplest and most useful group constraint, a 
cardinality constraint such as "want to play tennis with at 
least 2 or at most 4 individuals". Formally, a cardinality 
constraint Q is expressed as card(g) £ [lb, ub], where [lb, ub] 
(lb > 2) is the set of integers < ub and > lb. 
Example: A user wanting to play tennis in Cupertino or 
Sunnyvale at 8 PM with 2,3 or 4 people with a rating 5 can 
be written as the following query: 

Aume £ [8PM] A Aiocation £ {Cupertino, Sunnyvale} A 
A S p 0r t £ {Tennis} A that. A ra tin g = 5 A card(q) £ [2, 4] 

Matching Queries: We define the semantics for how/when 
coordination is achieved in stages by adding in each type 
of constraint in turn. First, we say that a set of queries 
qi, . . . ,qk are jointly satisfied if there is a point p £ dom(Ac) 
such that for every query qi and selection constraint Xj £ Sij 
in qt, we have p., £ Sij. Next, we incorporate join constraints 
as follows. Two queries q\ and qi are said to match if (i) 
their selection constraints are jointly satisfied, and (ii) join 
constraints on qi and qi are satisfied. Finally, we add group 
cardinality constraints: 



Definition 1 (Committable Query Group). A group 
of queries Q C Q is called a committable query group if 

• Selection conditions on queries in Q are jointly satisfied, 

• Vgi , qi £ Q, qi and qi match on join constraints, and, 

• \Q\ satisfies the cardinality constraint of all q £ Q. 

A committable group can be returned by the system as a 
valid set of users who can be matched together. All queries 
in such a group are called committed. 

Problem Statement: User coordination is now reduced 
to the problem of finding committable query groups. Note 
that the problem is online - the system cannot wait till all 
the queries have been submitted. 

Definition 2 (Group Coordination Problem). 
Given a stream of enmeshed queries Q = q\, qi, . . . ,q n , find 
sets of committable query groups such that the number of 
committed queries is maximized. We denote by OPt(Q) 
the offline optimal, or the maximum number of committable 
queries that can be committed if all the queries are known 
upfront. Our goal is to design an algorithm ALG such that 
for any finite subsequence Q of Q, the number of queries 
committed (alg(Q) ) is as close to OPt(Q) as possible. 

While maximizing the throughput (number of committed 
queries) is our main goal, we will also experimentally mea- 
sure query processing time, latency (measured from when a 
query enters to when it was actually committed as a group) 
and fairness with respect to group size of our algorithms. 

4. COMPLEXITY 

We show complexity results for the Offline and Online 
group coordination problems. 

4.1 Offline Problem 

In this section we show that finding the offline optimal 
OPT(Q) for the group coordination problem is NP-hard, 
whether we want to maximize the number of committed 
queries or groups. When maximizing queries, or groups, we 
show the problem is NP-hard even when there is a single 
coordination attribute. 

Lemma 1. Given a set of enmeshed queries Q with a sin- 
gle coordination attribute A, computing the maximum num- 
ber of committed groups is NP-hard. 

Proof, (sketch) We show hardness via a reduction to the 
well known NP-hard problem of finding the largest indepen- 
dent set in a graph [2]. 

Given an instance G — (V, E) of the max independent 
set problem, we construct an instance of the group commit 
problem as follows. There is a single attribute A, which has 
one value a v for every node v £ V; i.e., \A\ = \V\. The set of 
queries Q contains one query q e for every edge e = (u, v) £ 
E, with constraints (A £ {u,v} A cardinality £ {d u ,d v }), 
where d u is the degree of the node u. 



Note that in this construction, the only committable groups 
are sets of edges S u that are incident on a node u in the 
graph G. Moreover, if S u commits, then for all v adjacent 
to u, S v can not commit. Therefore, every feasible set of 
committed groups corresponds to an independent set in the 
graph. □ 

Lemma 2. Given a set of enmeshed queries Q with a sin- 
gle coordination attribute A, computing the maximum num- 
ber of committed queries is NP-hard. 

Proof, (sketch) We show hardness via a reduction to 
the maximum 3-dimensional matching problem, a classic 
NP-complete problem [2]. An instance of the 3-D match- 
ing problem consists of disjoint sets X, Y, Z and a subset 
TClxYx2of triples. M C T is called a matching if 
for any two distinct triples (x, y, z), (x' , y' , z') G M, we have 
x ^ x', y 7^ y' and z / z' . 

Given an instance of 3-D matching, we construct an in- 
stance of the group commit problem as follows. There is a 
single coordination attribute A whose domain is X x Y x Z. 
Without loss of generality, we can assume that every element 
inuGlUFUZ appears in at least one triple in T (other- 
wise 0-degree elements can be removed from the problem). 
With every element v, we associate a query q v having a car- 
dinality constraint of 3, and a coordination A £ S, where S 
is the set of triples that contain v. 

Every committable group corresponds to a unique triple 
(x,y,z). Moreover, since no two committed groups can 
share a query, the set of committed groups gives a 3-D 
matching. Finally, the proof follows from the fact that the 
number of queries is just 3 times the number of committed 
groups. □ 

4.2 Best-Effort Online Coordination 

In this section, we consider a special class of online algo- 
rithms. An algorithm is best-effort if when a query q enters 
the system at most one committable group is returned and 
any returned committable group includes q. It may not 
return a committable group even when one exists. By con- 
trast, an algorithm is optimal best-effort if when a query q 
enters the system exactly one committable group is returned 
that includes q, if such a committable group exists. 

Best-Effort algorithms are attractive because they ensure 
constant progress. Second, best-effort algorithms are k- 
competitive, where k is the maximum cardinality constraint 
of a query. That is, the number of queries committed by a 
best-effort algorithm is at least 1/fe times the number of 
queries committed by an offline optimal algorithm. The 
competitive result follows because for any group output by 
the best-effort algorithm, in the worst case, the offline opti- 
mal might have used each query in the group to commit a 
separate group. However, optimal best-effort algorithms are 
hard to design. 

Lemma 3. Best-effort algorithms are k-competitive. 

Proof. The following construction shows that the com- 
petitive ratio is at least k. Consider k 2 queries which are 
numbered qi , . . . , q k i in the order they enter the system. 
Queries gi, . . . ,qu form a committable group. Additionally, 
qi and g i(fe _i) +2 , . . . , q( i+1 )( k -i)+i form k different commit- 
table groups for i = 1 to k. Figure [l] shows the construction 
for k = 4; nodes are queries, edges represent whether or 
not queries match. All queries have cardinality constraint 




Figure 1: Example illustrating the competitive 
ratio of best-effort algorithms. Nodes are queries, 
edges represent whether or not queries match. All 
queries have cardinality constraint = 4. The num- 
bers also represent the order in which queries enter 
the system. 

= 4. More specifically, for queries 1-4, qi's constraint is 
A G {x,i}. For queries connected to g, (i = 1,2,3,4), the 
constraint is A € {i}. 

An optimal solution is to commit (1,5,6,7), (2,8,9,10), 
(3, 11, 12, 13) and (4, 14, 15, 16) (the k groups in the general 
case). However, a best effort algorithm would greedily com- 
mit (1,2,3,4), (qi,...,qk in the general case). Thereafter 
no other query can commit. Thus the competitive ratio is 
at least k. 

For any group output by the best-effort algorithm, in the 
worst case, the offline optimal might have used each query 
in the group to commit a separate group. Since the max 
cardinality constraint is k, the competitive ratio is < k. □ 

However, we next show that best-effort algorithm also are 
hard to design. When a new query comes in, best-effort al- 
gorithms need to find a committable group if there is one. 
But, this problem is NP-hard in the presence of join con- 
straints. In the absence of join constraints, we present a 
PTIME best-effort algorithm which will be the scaffolding 
for our efficient algorithms presented in the next section. 

Lemma 4. Computing an optimal best-effort solution is 
NP-hard, even if the number of attributes is a constant. Sup- 
pose the number of attributes is a constant. Ensuring that 
an algorithm always finds a committable group if there is a 
committable group is NP-hard. 

Proof, (sketch) Let S be the set of query groups that 
can be committed due to adding q to Q. Suppose there do 
not already exist any committable groups in Q. Hence, for 
all groups S G 5, q G S. 

It is easy to verify that the problem is in NP. Given a 
group, we can efficiently check whether it is committable. 

To show it is hard we present a reduction from the clique 
decision problem: the problem of determining whether there 
is a clique of size k in a graph G — (V, E) is well known to 
be NP-hard. Given an instance of the clique problem, we 
construct an instance of our problem with a single attribute 
A. For every edge e = (u, v) G E, we have a value a e G A. 
We use e to represent both (u, v) or (v,u). 

We construct one query q v for every v G V, and each 
query has cardinality equal to (k + 1). The queries are such 



Algorithm 1 Optimal Algorithm (No Join Constraints) 



Algorithm 2 BMA: Basic Matching Algorithm 



1: Input: A set of enmeshed queries Q, a new query q 
2: Output: true, if 3 committable group containing q, 
3: false, otherwise. 

4: Let P C dom(Au U .Ac), s.t., Vp £ P, p satisfies q 
5: for p £ P do 

6: // STEP 1: Identify queries whose coordination con- 
straints are satisfied by p. 
7: Let S p be the set of queries satisfied by p. 
8: 

9: // STEP 2: Identify a subset that satisfies the cardi- 
nality constraints. 

10: Let C denote a sorted list of numbers between lb q 
and ub q (inclusive) that also correspond to cardinality 
limits of queries in Sp 

11: for consecutive numbers Ci,Ci+i £ C do 

12: Let Si denote the set of queries which allow coor- 
dination with size in [c;,c;+i]. 

13: if Cj < \Si\ < c i+ i then 

14: return true 

15: end if 

16: end for 

17: end for 

18: return false 



1: Input: A set of enmeshed queries Q, a new query q, an 

inverted index Pindex 
2: Output: A committable group C (and Q <— Q — C), 
3: or null (and Q <— Q U {q})- 

4: Let P C dom(Au U Ac), s.t., Vp 6 P, p satisfies q 
5: for p £ P do 

6: // STEP 1: Identify queries whose selection constraints 

are satisfied by p. 
7: PQSp <— Pindex. lookup(p) 

8: CQSp <— subset of queries in PQSp satisfying p 
9: 

10: // STEP 2: Identify subset of queries whose join and car- 
dinality constraints are satisfied 
11: Set S p *- {q} 

12: for q' £ CQSp // in random order do 

13: if (q 1 matches Vq £ S p ) then 

14: Sp <- Sp U {<?'} 

15: C <— FindCommittableGroup(S p ) 

16: if G is not null then 

17: Remove all queries in C from Q and Pindex. 

18: Return C 

19: end if 

20: end if 

21: end for 

22: end for 

23: // No committable group found 

24: Add q to Query Table and to Pindex. 



that (q v ,qu) matches if and only if (v,u) £ E. This can 
be achieved just by constructing an ID attribute Aid and 
a "friend" attribute Af rien d- For a query q v , Aje, = v and 
Af r i en d = {u\(u,v) £ E} . Finally, we include a join con- 
straint, that. Aid £ Af r i en d . This require that each query 
q v be satisfied only by a q u m, where (u, v) £ E. 

We have a new query q that matches all queries we con- 
structed (by not having any constraint for q) and also having 
a cardinality k + 1. If we find a committable group in the 
new instance, we get a clique in the original graph. Thus 
the problem is NP-hard. □ 

In the absence of join constraints, there is a simple best- 
effort algorithm (Algorithm [T]) to determine if there is a 
committable group in the graph when a new query q enters 
the system. We first identify all points p in dom(Ac) that 
satisfy q. For each p, we compute the set S p of queries 
which are jointly satisfied by p. Finally, we can efficiently (in 
time linear in \S P \) compute a committable group of queries 
(which satisfy the cardinality constraints) as follows: divide 
the number line into segments determined by the upper and 
lower bounds on cardinality for all the queries in S p . For 
each segment (I, it), compute the queries S^u) that permits 
groups of size between I and u. If Z < |£>n«)| < u, then 
S(i_ u ) can be returned as a committable group. If there is 
no such group, then the result is empty. Algorithm [I] takes 
0(\Q\ ■ |P|) time, where P C dom(Ac) is the set of points 
that satisfy q. 

5. SCALABLE GROUP COORDINATION 

Given that optimal best-effort algorithms are NP-hard, 
we present four heuristic algorithms that will work well on 
workloads satisfying the following assumptions: 

Al. High Selectivity: We assume most queries are selec- 
tive, and hence, can be indexed based on some coordination 
attributes. First, people know what they want: queries like 



"play any sports at any time and anywhere" are not typi- 
cal. Second, selectivity can be enforced by application user 
interfaces that disallow wildcards, and enforce limited dis- 
junction (e.g., at most two locations allowed in a query). 

A2. Small Group Coordination: We also assume that the 
average cardinality of queries are small, e.g. < 50. This 
ensures that queries do not wait too long to be committed. 

All our algorithms use the following structure. They all 
start by using a subset of the most selective attributes in a 
query as an index to quickly find a (hopefully small) "short 
list" of potentially matching queries. For example, out of a 
million concurrent enmeshed queries, only 50 queries in the 
system may specify 'playing Tennis in Cupertino at 3 PM 
next Tuesday'. While the first pruning step only considers 
selection constraints, the second step traverses the "short 
list" to greedily attempt to build a committable group while 
also incorporating join and cardinality constraints. 

5.1 Basic Matching Algorithm (BMA) 

Basic matching algorithm, BMA (Algorithm is a best- 
effort algorithm that traverses the "short list" in random 
order, and stops when it finds the first committable group. 

Given a query q, BMA iterates over all possible points 
p £ Ac that satisfy the selection constraints of q in ran- 
dom order. For each p, it first finds a set of queries (PQS P ) 
that partially match q using Pindex, which is an in-memory 
inverted hash index defined over a subset of coordination 
attributes (e.g., location and time). This index (built using 
techniques in [l]) can support hierarchical values, e.g. week- 
day instead of dates. Since Pindex is only on a subset of 
attributes, not all queries in PQS P are jointly satisfied by 
satisfy p. Hence, next a subset of queries satisfying p, called 
CQSp, is computed. CQS P is the "short list" we referred to 
earlier. If assumption Al holds, ICQS'pl will be small. 

However, not all pairs of queries in CQSp may match 
according to the join constraints. Hence, BMA next greed- 
ily computes S p C CQS P , a subset of queries that match 



(and satisfy p) as follows: Starting with S p = {q}, BMA 
iteratively adds a randomly chosen query q' £ CQSp to S p 
if it matches every other query in S p (based on join con- 
straints). As the following example illustrates, there may 
be many choices for S p for the same CQS P . 

EXAMPLE 1. Consider queries with a user attribute A ra ting G 
{1, 2, . . . , 5}. Let q be a query with no join constraints, with 
A ra ting = 3. Suppose CQS p consists of gi,q2,?3, such that 
qi-Arating = 4, q2-A ra tin g = qs,-A ra tin a = 2, and card(q2) — 
card(qs) = 3. Suppose qi has a join constraint that. A ra ting < 
A ra ting ( only play with lower ranked players ), and 92 , 93 have 
join constraints that. A ra ung = A ra ting + 1 (only play with 
players ranked 3). Then if the queries are considered in the 
order 51,52,93, the resulting S p = {q, qi}. If qi or §3 is 
considered before qi, then the resulting S p = {q,q2,qs} ■ 

Every time a new query is added to S p , the FindCom- 
mittableGroup subroutine attempts to find a committable 
group that satisfies cardinality constraints as follows: Every 
query added to S p is also added to an inverted index from 
possible group sizes to queries, called Cindex. A query with 
multiple group sizes appears multiple times in Cindex, one 
for each group size. BMA iterates over Cindex to check 
whether there is some m such that there are > m queries in 
S p that permit a group of size m. BMA picks some group of 
m queries. Assumption A 2 ensures that the average num- 
ber of iterations over Cindex is small. If a group is formed, 
all queries from the group are removed from the system. 
Otherwise, we add current query q into the Cindex. 

Example 2. For instance, suppose 91,92,93,94,95 are match- 
ing queries in S p with cardinality constraints 2, [2, 3], [2, 3], [2, 3] 
and 3 respectively. Then Cindex will contain 2 entries: 
2 -> {qi, g 2 , <?3, 94} and 3 -> {g 2 , <?3, 94, qs} ■ Any pair from 
{91,92,93,94} or any subset of size 3 from {92,93,94,95} are 
committable. FindCommittableGroup will return some 
size 3 subset of {92, 93, 94, 9s}- 

5.2 Extensions to BMA 

BMA randomly traverses the "short list" and returns the 
first committable group it finds. The next three algorithms 
extend BMA by more carefully choosing a committable group 
(among many alternatives), delaying committing to improve 
throughput, and by using more intelligent traversal orders. 

5. 2. 1 NES: No Early Stop 

BMA may choose to commit a group even before all the 
queries in CQS P are seen. NES, on the other hand, com- 
putes S p by considering every query in CQS, and then calls 
FindCommittableGroup to find the largest subset of S p 
satisfying cardinality constraints. In Example [2j suppose 
the are considered in order of their indices. 

BMA will commit only 2 queries, (91,92) after seeing the 
first two queries. On the other hand, NES waits to see all 
queries and commits (92,93,94). 

5. 2. 2 DELAY: Delayed Matching 

Both BMA and NES are best-effort - when a new query 
arrives, if a group can be committed, it will be committed. 
However, delaying matching smaller groups may help find a 
larger group later. 



Example 3. Consider again Example^ Suppose queries 
come in the order of their index. After 91 and 92 enter the 
system, any best-effort algorithm would commit the group 
(91,92). Similarly, after 93 and 94 enter the system, any 
best-effort algorithm would commit the group (93,94). How- 
ever, if we waited till 95 came into the system, NES can 
commit (93,94,95), thus committing all queries. 

We implement DELAY as follows. Let the age of a query 
denote the time elapsed since it was inserted into the sys- 
terrQ When a query first arrives at time t, we compute its 
CQSp, and proceed to find a committable group only if the 
average age of the CQSp is greater than a threshold r. Oth- 
erwise, 9 is inserted into a scheduler, akin to a timing wheel, 
to be reevaluated at time t + r. A query is reevaluated at 
most once. DELAY can increase processing time and query 
latency, as some queries are evaluated twice. If at most m 
queries simultaneously enter the system, DELAY returns no 
more than 2m committable groups: < m groups from new 
queries and < m from delayed queries. 

5.2.3 Query Matchibility 

All of the previous algorithms may choose to commit a group 
that is not optimal in terms of future queries. 

Example 4. Consider queries that have an attribute Adhere 
with domain {Cupertino, Sunnyvale, Campbell} . Suppose 
Adhere = Campbell is very rare. Consider a new query q 
with A w here = Campbell and cardinality constraint [2,3]. 
Suppose its CQS contains three queries 91,92,93 with 91,92 
having A w here G {Cupertino, Campbell} and 93 requiring 
A w here £ {Campbell}, resp. Moreover, 91,92 have cardi- 
nality constraint 3, while 93 requires 2. Then, returning the 
group (q,qs) is better than returning (9,91,92) (even though 
the latter is larger), since it is more likely that a Cupertino 
query will arrive in the future than a Campbell query. 

We call the propensity of a query to be matched as its 
matchibility, and is intuitively the expected number of other 
queries that will match 9. Intuitively, a query has low 
matchibility either because its selection or join constraints 
are hard to satisfy, or because it has large cardinality. We 
use a simple online approach to estimate matchibility. 

We define the matchibility of a query 9 as the number 
of times q appears in a CQSp before it is committed. Intu- 
itively, a query that appears more frequently in the CQSp of 
other queries should have a higher chance to find matches. 
The initial value of a query's matchibility is 

m(g) = n q + c/lb q 

where c is a constant, lb q is the lower bound on the cardi- 
nality constraint in the query, and n q — ub q — lb q + 1 is the 
number of possible cardinality values that satisfy 9. For ex- 
ample, if a query has a cardinality constraint [4, 6], then its 
initial matchibility is set to c/4 + 3. The motivation is that 
smaller groups are more matchable; further, a wide range 
of cardinalities is more matchable. Every time 9 appears in 
some other query's CQSp, 771(9) ls incremented by 1. 

We consider two simple heuristics - high matchibility first 
(HMF) and low matchibility first (LMF) that traverse the 
"short list" CQSp in matchibility order. Intuitively, LMF 

1 We assume each query is inserted at a new time instant. 
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Table 2: Parameter distribution for generating users 
in our synthetic workload. 
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Table 3: Parameter distribution for generating 
queries in our synthetic workload. 

may match more queries than HMF by reserving popular 
queries for later consideration. 

Discussion: The 3 extensions presented in this section are 
orthogonal, and an algorithm can be implemented with some 
or all of them together. However, given that DELAY post- 
pones queries for later consideration, it seems unreasonable 
to not consider all queries. Hence in our implementation, 
DELAY always implies NES, but not vice versa. 

6. EVALUATION 

In this section we evaluate our algorithms on synthetically 
generated workload for an example sports coordination ap- 
plication. 

6.1 Example Application 

We describe our evaluation in terms of an example sports 
coordination application. In this application, users want to 
find sports partners that live nearby and with similar rat- 
ings. There are two user attributes - home location, rat- 
ing. There are four coordination attributes - location, time, 
sport and opponent rating. Each user specifies selection 
constraints on the desired location(s), desired time(s), and 
sport. Additionally, a user may specify what the opponent's 
rating must be - this can be captured using a join constraint. 
For instance, a user wanting to play tennis in Cupertino or 
Sunnyvale at 8 PM with 2,3 or 4 people with a rating 5 can 
be written as the following query: 

Aume. G [8PM] A Aiocation G {Cupertino, Sunnyvale} A 
A s port G {Tennis} A that. A raUng = 5 A card(q) G [2,4] 

6.2 Metrics and Setup 

We implemented all the algorithms presented in the paper, 
namely BMA, NES, DELAY, LMF and HMF. We compare 
our group coordination algorithms using system measures, 
throughput and query processing time, as well as user mea- 
sures, latency and fairness. These metrics are defined in 
Section[3] We use a single machine that is a 2 x Xeon L5420 
2.50GHz running 64bit RHEL Server 5.6 with 16 GB mem- 
ory. 

6.3 Synthetic Workload 

Our default data trace contains 1 million queries for 200,000 
unique users. Each user has have about 5 queries in the 
workload since each user has a equal chance to be chosen 
for a query. We also enforce that no more than one query 



from the same user can share a time slot. Each query in the 
workload contains a monotonically increasing time stamp, 
desired location(s), desired time(s), action, desired ratings 
for potential matching candidates and a range of desired 
group size. We now specify details for each parameter. The 
parameters are also summarized in Tables [2] and [3] All Zipf 
distributions use an exponent parameter of 1 where the sec- 
ond most common frequency occurs 1/2 as much as the first, 
etc. 

Location: Our workload generator generates 100 locations, 
represented by a two dimensional array location [10] [10] . Each 
user is assigned a location (i.e. location[k][j]) as his/her 
home location based on a Zipfian distribution that models 
the fact that some locations (e.g., San Francisco) have more 
users than others (say Brisbane, CA). Besides the home lo- 
cation, a user query can also choose from among 4 neighbor- 
ing locations: if a user is at location [k], [j], the neighboring 
locations are [fc+l][j], [fc-l][j], [fc][j + l], W[j-1]- This mod- 
els the fact users that only want to play sports in locations 
"near" their home locations. We allow at most 2 locations 
specified in a query using a Zipfian distribution wherein it 
is more probable to generate queries with 1 location (always 
home location) than 2 alternative locations (home location 
+ 1 choice made uniformly among the 4 neighbors). 

Time: Times are represented as hourly slots, e.g. 2/1/2012 
3pm. The domain is between 8am-8pm for the following 4 
weeks. We assign higher probability when generating slots 
on weekends than those on weekdays. Times in a query are 
chosen uniformly from slots in the domain based on their 
relative probabilities. No more than 2 time slots can be 
specified in a query; as in the case of location, we use a Zip- 
fian to choose the number of time slots specified in a query, 
with a higher probability of a query being generated with 1 
time slot than with 2 time slots. 

Actions and Ratings: Each user has a set plays up to three 
sports with the exact number being chosen using a Zipfian 
distribution, with one being the most probable. Once the 
number of sports is chosen, the specific sports played by a 
user is chosen uniformly at random from a set of 10 sports. 
We also assign a rating (r) for each chosen sport to represent 
a user's skill level, again using a Zipfian distribution. Similar 
to the case of location, we limit each query to have no more 
than 2 desired ratings, where most queries will have 1 rating 
(equal to r), some have two ratings (r and either r + 1 or 
r — 1) where the frequency of 1 versus 2 is chosen by a 
Zipfian. While a user can play at most 3 sports, each user 
query picks a single sport uniformly at random from among 
the set of sports the user plays. 

Group sizes: Group size generation is described in the 
next section. 

6.4 Finding a yardstick for optimality 

Given that finding an optimal solution to enmeshed queries 
is NP hard, a major challenge is measuring how close our 
matching algorithms come to the optimal solution. In ad- 
dition, since the total space of attributes (e.g., the cross- 
product of location, time etc) is large, if we get a low per- 
centage of queries matched by our algorithms, we cannot de- 
termine whether the low match percentage is caused by inef- 
fective algorithms or because the workload generated sparse 
points in a large domain space. To solve this dilemma, as 
we said earlier, we generate data traces that only contains 
queries belong to pre-matched groups. The advantage of 



this approach is that we know an optimal solution is able 
to find matches for all queries in the workload, which then 
serves as a yardstick for our heuristic algorithms. 

More precisely, to generate queries with pre-matched groups, 
we first generate a group signature that contains a single 
value for each dimension and a group size (k). We then se- 
lect a set of qualified users using an inverted index that maps 
from (location, sport, rating) to users. Next, we generate k 
queries that are compatible with the group signature from 
the set of qualified users by following distributions described 
in the workload generation description. 

We generate a query's group size as follows. A pre-match 
group size m is first generated using a Zipfian distribution 
on the domain [2, 12] with group sizes of size 2 being most 
frequent. Next, for each query, we generate a random value 
between 2 and m as the low end of the group size range 
in that query. We then generate the query group size as 
Min(2 * (m — I) + 1, 12 — I + 1), where 12 is the maximum 
group size. This serves to make m the center of each group 
size range in each prematched query. 

In the rare case that we cannot find enough queries to sat- 
isfy a group signature, we generate a new group signature 
and continue. The process terminates when a specified total 
number of queries have been generated. In order to spread 
queries from pre-matched groups across the workload, we 
define a group interval as a parameter that controls ran- 
domization. For example, if the group interval is 5000, our 
workload generator will generate 5000 pre-matched groups 
at a time and then randomly shuffle all queries within each 
range of 5000. 

To serve as a yardstick, we designed an "optimal" algo- 
rithm (OPT) that is only required to scan the workload 
once and find matches based on pre-matched group id and 
group size information embedded in the query trace. This 
information, of course, is ignored by the regular matching 
algorithms. The match percentage for the "optimal" solu- 
tion is 100%. The average query latency for OPT varies 
when the group interval changes. However, it should always 
be less than the group interval since a match is guaranteed 
to be found within a group interval in OPT. 

6.5 Experimental Results 

In this section, we describe and interpret the experimental 
results to identify the best performing heuristic algorithms. 
Recall that BMA is the basic matching algorithm that pro- 
cesses the set of potential matches (CQS) in random order 
but stops after finding the first match; LMF and HMF are 
also early stopping algorithms, but they process the CQS 
in the order of least matchable (respectively highest match- 
able); finally NES is an algorithm that processes the entire 
CQS even after finding an initial match to search for larger 
matches. 

We compare these algorithms in terms of system measures 
(percentage of queries matched and query processing time), 
and user metrics (average query latency and fairness mea- 
sured by average group size). Suppose, for example, that 
a matching algorithm has a group size 2 query it can im- 
mediately satisfy, but also a compatible group size 4 query 
that is compatible but not immediately satisfiable. The al- 
gorithm has to exercise forbearance in order to satisfy the 
size 4 request in the future; greedy choices, by contrast, can 
have good match percentage but poor fairness. 

Performance with Varying Group Interval: Recall that 



the group interval is a parameter of the workload gener- 
ator that controls how scrambled the workload compared 
to a pre-matched set of queries that "seeds" the workload. 
Clearly, an omniscient, optimal algorithm will be able to 
match all queries in such a workload because this workload 
is a randomization of a workload where there is a match for 
all queries. However, our algorithms such as BMA will do 
worse than optimal because they are online and not offline, 
and make heuristic choices to reduce computation. Fig- 
ure[2]shows the percentage of queries matched by our various 
heuristics. The hypothetical offline optimal algorithm is not 
shown because it always achieves 100% matching. 

The figure shows that the percentage of matches does not 
change perceptibly as the group interval (scrambling dis- 
tance) increases for all algorithms. It also shows NES has 
the highest match percentage (around 81%) while HMF per- 
forms worst (around 76%) and the baseline BMA is slightly 
better (77%). This is not surprising because NES tries to 
find the best match without stopping. Intuitively, HMF 
is bad because it processes highly matchable queries early 
which makes it less likely that later queries in the CQS will 
be matched. Given early stopping, LMF does better because 
it does not squander more matchable queries when other 
queries will do instead; a random order performs in between 
LMF and HMF. Note that the difference between 76% and 
81% may seem small but if there is revenue attached to each 
match, a 5% uptick in revenue is appreciable. 

NES gets a higher match percentage by processing the 
CQS more thoroughly. Thus we might hypothesize that 
NES will require more processing time in return for a higher 
match percentage. Figure [3] is perhaps surprising because 
it shows that NES has the smallest average processing time 
(for example, around 40 /isec at a group interval of 2000) 
while HMF has close to 50 fisec and the others are in be- 
tween). Intuitively, this is because NES process more queries 
on average per scan of the CQS and hence removes queries 
early from the CQS; this in turn requires less scanning over- 
head in the future. Note also that the processing time in- 
creases slightly with scrambling distance; this makes sense 
because the more far apart potentially matchable queries 
are placed in the workload, the greater the average length 
of the CQS and hence the processing time. 

While NES does well from the system provider's point of 
view, what about from the user point of view? Figure [4] 
shows the average latency (measured in terms of queries 
after which a user request is satisfied) for the algorithms. 
Recall that the workload is generated from a pre-matched 
set of queries that are scrambled randomly within a group 
interval. The latency of a query Q for the optimal algorithm 
is the difference between the index of the last query in the 
pre-matched set for Q and the index of Q itself. Thus the la- 
tency for the optimal algorithm shown as a reference is very 
small and increases linearly with the group interval. Once 
again, NES has the lowest average latency (around 21,000 
queries) and HMF has the worst latency (around 25,000). 

While NES is the best latency, it is much worse than op- 
timal. Further scrutiny of the results revealed that 60% 
of queries in NES are fast because NES does find most 
pre-matched groups. However, because NES is a heuris- 
tic, in around 40% of the cases, NES misses finding its pre- 
generated (and hence optimal) groups. In that case, either 
the query will never be matched or, by random chance based 
on the workload, it can be matched later but after a much 
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higher latency whose average value is roughly the average 
time for a query to find another set of compatible queries. 
Thus the average latency is pushed very high by outliers. 
We calculated the median latency for NES at a group in- 
terval of 5000 as 135 and the 90% latency as around 7591 
compared to the optimal latency of 1658. Given that the 
outlier latency is largely an artifact of the workload genera- 
tion model, this artificial increased latency is unlikely to be 
an issue in practice. 

Finally, recall that in measuring fairness, we are trying 
to penalize algorithms that boost match probability by ar- 
tificially favoring some group sizes (say small groups) over 
others. Part of the novelty of our social coordination system 
is that it allows coordination among users with various size 
group requests (unlike say a dating service where the sizes 
are always 2). It would be unfortunate if this generality in 
the service specification was accompanied by service bias in 
terms of requested group size. We use the average size of the 
group when compared to the optimal algorithm as a fairness 
metric. Figure [5] shows the average group size returned by 
the various algorithms. 

Given the workload, the average group size of matches 
returned by the optimal algorithm is around 4.6. On the 
other hand, BMA is around 2.6 and NES is around 2.8. 
This again follows because NES is prepared to wait to get 
larger groups and hence is fairer to larger groups. 

The results so far show that NES is better than using 
matchability as a simple stopping criterion and certainly 
better than BMA with its random order and early stopping. 
However, despite this we believe that LMF is a promising 
heuristic and comes close to NES. For example, if the CQS 
gets too large, NES may be infeasible and the LMF heuristic 
may be needed at least for graceful degradation. 

Further, LMF particularly shines when there are work- 
loads with a significant fraction of very discriminating users 
that are hard to satisfy (e.g., want to play tennis in Camp- 
bell, CA at 6 am) and very easygoing users (willing to play 



tennis anywhere in the Bay Area at any time). A small mod- 
ification of Example 4 can be constructed by adding a fifth 
query qs that can satisfy q± and <j2 such that all 5 queries 
are satisfied by LMF but only 3 are satisfied by NES. Re- 
peating this sequence indefinitely leads to a workload where 
NES achives a match percentage of only 60% compared to 
100% for LMF. 

Performance of Delayed Evaluation: So far we have looked 
at the performance of the basic algorithm augmented with 
matchability and no early stopping. We now evaluate the 
effect of our second and more complex refinement: adding a 
fixed delay threshold. Recall that in delayed evaluation, for 
each query, we calculate the average age of other queries in 
its Compatible Set of Queries (CQS) and delay the process- 
ing of that queries if the average age is too small based on 
simple timer processing. How does the basic performance 
of all measures change as the delay increases? Intuitively, if 
the group interval (scrambling interval) is say 5000, delaying 
processing by around 5000 should make it more likely that 
a query will behave close to its optimal and the gains should 
fall off after that. 

This is indeed what we find. In all cases, we fix the group 
interval at 5000 and use the NES algorithm augmented with 
a delay parameter that varies from 10 to 10,000. Figure [6] 
shows that as the delay increases the percentage match in- 
creases from around 80% for NES (reference line) to around 
87% but the gains fall off after a delay of 5000. Figure [3] 
shows that the query processing time can increase with de- 
lay from around 40 /isec to around 50 ^isec because of the 
overhead of the delay processing. However, as we get closer 
to the optimal delay of 5000, the processing time falls back 
to around 40 fisec and is even slightly faster than NES. We 
hypothesize that the better match efficiency makes up for 
the increase in delay timer processing (which is a fixed over- 
head regardless of the magnitude of the delay). 

Figure [8] shows a more interesting tradeoff. As the de- 
lay increases, we see that the latency drops sharply at first 
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from 23,000 to around 15,000. This is because increasing 
the match percentage by 5% reduces the outliers by 5% 
which sharply decreases the average latency, Beyond the 
optimal delay of 5000, however, the match percentage does 
not increase and the delay is merely an artificial waiting 
penalty: thus beyond 5000 the average latency starts in- 
creasing again. 

The user will rightly complain that this is cheating. In a 
synthetic workload, we knew the group interval and hence 
could estimate the optimal delay. However, in a real de- 
ployment, the system can keep statistics such as Figure [8] 
by periodically varying the delay (akin to explore-exploit 
systems) to find the knee of the curve. The knee of the 
curve can be used to estimate the optimal amount of delay 
to be added. 

Finally, Figure [9] shows that delayed processing also de- 
creases the bias against large groups by coming closer to the 
average group size of the optimal algorithm. As the delay 
increases to the knee of the curve (5000), the average group 
size gets close to 3.5 which is much closer to 4.5 which rep- 
resents the ideal (optimal) compared to say 2.6 for BMA. 

Recommendations for Deployment: Our results suggest 
that a combination of DELAY and NES performs well across 
all measures. The system can find the smallest value of 
added delay after which matching performance does not im- 
prove significantly. LMF may be useful in workloads that 
have many discriminating users, and when the average CQS 
length become so long that NES becomes infeasible. 

In terms of scaling, an average of 40 /xsec implies 25,000 
queries a second on a single core using a high end 2.5 Ghz 
machine. However, throughput can be increased by forking 
multiple threads, leveraging multiple cores, and parallelizing 
query processing by region using multiple servers: users in 
Cairo are unlikely to coordinate with users in Cupertino. 

7. CONCLUSIONS 

Social coordination may represent the next step beyond 
the social awareness provided by today's Online Social Net- 
works (OSNs). We focused on the problem of fluid social 
coordination where the set of people involved is unknown 
at the start, may change quickly over time, and may not 
be part of one's friends in any OSN. Such fluid coordination 
allows forming weak ties [3] that can enrich our lives beyond 
the strong ties formalized by OSNs. 

In our formalism, users declaratively specify coordination 
objectives using selection constraints on coordination vari- 
ables, join constraints on pairs of user variables, and group 
size constraints. While declarative specification can reduce 



user effort compared to browsing, we recognize that, in some 
cases, browsing can allow more flexibility. We suggest the 
following research directions in social coordination: 

1. Economics: Enmeshed queries extend dating services 
to arbitrary group sizes. Some dating services provide better 
matches for users who pay more. What is the natural way 
to extend enmeshed queries to specify a willingness to pay 
for matches? How does this relate to auction theory? 

2. Soft Constraints: While we consider hard coordina- 
tion constraints, users might prefer to declare relative prefer- 
ences over attributes like place and time. How can enmeshed 
queries be extended to handle such "soft-constraints"? 

3. Recommendations: Amazon and NetFlix recommend 
new choices based on past selections. On what basis should 
a coordination system recommend groups to users? A user 
that often plays tennis may like to hear about 3 nearby, 
compatible tennis players who wish to find a fourth player. 

4- Query Flexibility: Our simple model of a conjunction of 
disjunctions can be generalized. For example, in ad match- 
ing n] a richer set of boolean queries can be expressed at 
the possible cost of increased computation time. Further, 
our paper uses a completely declarative (system chooses) 
model and allows no browsing (user chooses) as in Meetup. 
What is the best way to combine browsing and declaration? 

5. Data Mining: Social coordination tools may allow so- 
cial scientists to answer questions about the sociology of co- 
ordination. What are the metrics (e.g., fraction of success- 
ful coordinations, keystrokes to coordination completion) by 
which one judges a successful social coordination? A social 
scientist could test hypotheses about successful coordina- 
tions such as "small groups are more likely to find matches" . 

Social coordination fulfills a basic human need to connect 
to other human beings. Further, fluid social coordination 
allows building weak ties to a larger set of people than our 
friends. While such weak ties classically occur by serendip- 
ity as in meetings at the proverbial water cooler, enmeshed 
queries may help institutionalize such serendipity. While we 
have taken a small step in formalizing aspects of the prob- 
lem, the vista for further work appears inviting: the solution 
space can be enriched if social scientists, database theorists, 
economists, and system designers all join the conversation. 
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